Word Embeddings to Enhance Twitter Gang Member Profile Identification
Abstract
Gang affiliates have joined the masses who use social media to share thoughts and actions publicly. Notably, they use this public medium to describe recent illegal actions, to intimidate others, and to share outrageous images and statements. Agencies able to unearth these profiles may thus be able to anticipate, stop, or hasten the investigation of gang-related crimes. This paper investigates the use of word embeddings to help identify gang members on Twitter. Building on our previous work, we generate word embeddings that translate what Twitter users post in their profile descriptions, tweets, profile images, and linked YouTube content into a real-valued vector format amenable to machine learning classification. Our experimental results show that pre-trained word embeddings can boost the accuracy of supervised learning algorithms trained over gang members’ social media posts.
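As a rough illustration of the general idea of turning text into vectors for classification, the sketch below averages per-word vectors into one profile vector and feeds it to a simple classifier. It is not the authors' pipeline: the three-dimensional embedding table, the neutral labels "a" and "b", and the helper names (embed_profile, train, predict) are all hypothetical, and a nearest-centroid rule stands in for whichever supervised learner is actually used.

```python
from math import sqrt

# Toy 3-dimensional embeddings; the values are invented for illustration only.
# A real system would load pre-trained vectors instead.
TOY_EMBEDDINGS = {
    "music":  [0.9, 0.1, 0.0],
    "studio": [0.8, 0.2, 0.1],
    "coffee": [0.1, 0.9, 0.0],
    "travel": [0.0, 0.8, 0.2],
}


def embed_profile(text, embeddings=TOY_EMBEDDINGS):
    """Average the vectors of known words; words without a vector are skipped."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    dim = len(next(iter(embeddings.values())))
    if not vectors:
        return [0.0] * dim  # no known words: fall back to the zero vector
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labeled_profiles):
    """Build one centroid per label from (text, label) pairs."""
    by_label = {}
    for text, label in labeled_profiles:
        by_label.setdefault(label, []).append(embed_profile(text))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in by_label.items()
    }


def predict(model, text):
    """Return the label whose centroid is most similar to the profile vector."""
    vector = embed_profile(text)
    return max(model, key=lambda label: cosine(model[label], vector))


if __name__ == "__main__":
    model = train([("music studio", "a"), ("coffee travel", "b")])
    print(predict(model, "studio music tonight"))
```

Averaging is only one simple way to pool word vectors into a fixed-length feature, and it discards word order. It is used here because it keeps the example short.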
Similar Resources
Finding Street Gang Member Profiles on Twitter
Balasuriya, Lakshika. M.S., Department of Computer Science and Engineering, Wright State University, 2017. Finding Street Gang Member Profiles on Twitter. The crime and violence street gangs introduce into neighborhoods is a growing epidemic in cities around the world. Today, over 1.4 million people, belonging to more than 33,000 gangs, are active in the United States, of which 88% identify them...
Signals Revealing Street Gang Members on Twitter
We study the problem of automatically finding gang member profiles on Twitter. We outline a process to curate one of the largest sets of verifiable gang member profiles that has ever been studied. A review of these profiles establishes differences in the language, images, YouTube links, and emoji features gang members use compared to the rest of the Twitter population. We generate word embeddin...
Improving Twitter Sentiment Classification Using Topic-Enriched Multi-Prototype Word Embeddings
It has been shown that learning distributed word representations is highly useful for Twitter sentiment classification. Most existing models rely on a single distributed representation for each word. This is problematic for sentiment classification because words are often polysemous and each word can contain different sentiment polarities under different topics. We address this issue by learnin...
Twitter Author Profiling Using Word Embeddings and Logistic Regression
The general goal of the author profiling task is to determine various social and demographic aspects of the author based on his pieces of writing. In this work, we propose an approach that combines word embeddings and classical logistic regression for identifying author gender and language variety based on the corresponding tweets. The model was trained on PAN 2017 Twitter Corpus that contains ...
Sentence Modeling with Deep Neural Architecture using Lexicon and Character Attention Mechanism for Sentiment Classification
Tweet-level sentiment classification in Twitter social networking has many challenges: exploiting syntax, semantic, sentiment and context in tweets. To address these problems, we propose a novel approach to sentiment analysis that uses lexicon features for building lexicon embeddings (LexW2Vs) and generates character attention vectors (CharAVs) by using a Deep Convolutional Neural Network (Deep...
Journal: CoRR
Volume: abs/1610.08597
Publication date: 2016